ABSTRACT

A series of computer programs and routines designed to assist
researchers in the analysis of language usage was developed
by the Southwest Regional Laboratory (SWRL). This document is one of
a series that describes design specifications for the individual
modules which comprise the Language Analysis Package (LAP). The
Content Module functions as a semantic content analysis module by
allowing the user to construct any number of dictionary files using
phrases and/or single words. Each dictionary file will represent a
set of user-defined semantic categories. The program will "score" the
input text by matching it against the dictionary, and it will give the
user the total number of different categories a particular word or
phrase falls into, as well as the categories into which it falls.
Program characteristics, data file specifications, and computer file
layout are provided. (Author/DGC)
ABSTRACT

This is one of a series of technical design specifications for
individual modules of the Language Analysis Package (L.A.P.).

The Content Module will allow the user to specify semantic categories
and to obtain an analysis of the input text in terms of those categories.
DESIGN DOCUMENT: CONTENT MODULE - L.A.P. VERSION I

This document is one of a series of programming design
specifications for individual modules of the Language Analysis Package
(L.A.P.). The section of the system design to which it is related
is 6.7.0.

Program Objective 

Version I of the Content Module will function as a semantic
content analysis module by allowing the user to construct any number
of sorted dictionary files using phrases and/or single words. Each
dictionary file will represent a set of user-defined semantic
categories. The program will "score" the input text by matching it
against the dictionary and give the user the total number of different
categories a particular word or phrase falls into, as well as the
categories into which it falls.

1. See Porch, Ann. TM 5-72-06 "Language Analysis Package (L.A.P.) 
System Design (Version I)" for an overview of the package. 

2. A modification of an existing program will be used for the
Version I Content Module. The program is SCORTXT, developed by
Gerald Fisher of the University of Connecticut. Full documentation 
on the program may be found in Fisher, Gerald. "The SCORTXT 
Program for the Analysis of Natural Language", University of 
Connecticut, Bureau of Educational Research. 
Constraints and Limitations

- When the text is scanned for dictionary matches, only the 
longest string will be matched. For example, if the dictionary 
contains both "not very much" and "not very," the text phrase 
"not very much" will be counted as matching only "not very much."

- No punctuation (except apostrophes and hyphens) may be included
in dictionary phrases.

- Texts longer than 1,500 words must be broken into sub-texts if
high-frequency words are category entries.
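
The longest-match constraint can be illustrated with a minimal sketch
in Python. This is not the SCORTXT code; the function name
longest_matches, the case folding, and the decision to skip past the
words of a matched phrase are all assumptions made for illustration.

```python
def longest_matches(tokens, dictionary):
    """Scan left to right; at each position keep only the longest phrase found."""
    # Store each dictionary phrase as a tuple of lower-cased words.
    phrases = {tuple(p.lower().split()) for p in dictionary}
    max_len = max((len(p) for p in phrases), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in phrases:
                matches.append(" ".join(candidate))
                i += n  # assumption: matched words are not rescanned
                break
        else:
            i += 1  # no phrase starts at this word
    return matches


text = "he did not very much like it".split()
print(longest_matches(text, ["not very much", "not very"]))
# → ['not very much']
```

With the same dictionary, the text "not very good" would match only
"not very," since the longer phrase is not present in the text.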

Options and Defaults

Options and defaults for the Content Module will be as follows:

- Print text in original form (Default = no print) 

- Print text in array form (Default = no print)

- Print sorted dictionaries (Default = no print)

- Print a reduced text (Default = no print)

- Print an item analysis of each category (Default = print) 

- User specified word length for text in array form
(Default = 16)

- User specified input record margins for text (Default = 1,72) 
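
As a rough illustration, the options above could be collected into a
single settings record whose defaults mirror the list. The sketch
below is in Python; the class name ContentOptions, the field names, and
the helper text_field are hypothetical. Reading the margins "1,72" as
a 1-based, inclusive column range on each input record is also an
assumption, since the specification does not define them further.

```python
from dataclasses import dataclass


@dataclass
class ContentOptions:
    # Print switches; defaults follow the list above.
    print_original_text: bool = False
    print_array_text: bool = False
    print_sorted_dictionaries: bool = False
    print_reduced_text: bool = False
    print_item_analysis: bool = True
    # Word length used when text is held in array form.
    array_word_length: int = 16
    # Assumed meaning: first and last record columns that contain text.
    margins: tuple = (1, 72)


def text_field(record: str, opts: ContentOptions) -> str:
    """Return the part of an input record that lies inside the margins."""
    left, right = opts.margins
    return record[left - 1:right]


opts = ContentOptions()
record = "A" * 72 + "00000010"      # 80-column record, last 8 columns ignored
print(len(text_field(record, opts)))
# → 72
```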

Data File Specifications 

Input files for Version I must follow the ordering shown 
in Appendix A. Dictionary input will consist of one or more dictionaries.

Output for Version I will consist of punched or printed
output as shown in Appendix B.
Significant Algorithms
Category Indicator Algorithm: 

After a dictionary is read in and sorted internally into
alphabetical sequence, a bit string will be created for each dictionary
entry with on-off indicators for each category represented. Thus
if the dictionary has 1,000 entries which fall variously under five
categories, then for each of the 1,000 entries a bit string of length
five will be created with 1's in each category position to which the
entry belongs. The sorted dictionary and the dictionary bit strings
are added to the file DICT.
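
The category indicator idea can be sketched in a few lines of Python.
This is an illustration only, not the module's implementation: the
function name build_bit_strings, the in-memory dictionary layout, and
the five category names are invented for the example. Counting the
1's in an entry's bit string gives the number of different categories
the entry falls into.

```python
def build_bit_strings(entries, categories):
    """Return (entry, bit string) pairs in alphabetical order of entry.

    entries    -- maps each dictionary entry to the set of its categories
    categories -- ordered list of category names; one bit position each
    """
    table = []
    for phrase in sorted(entries):
        # "1" where the entry belongs to the category, "0" otherwise.
        bits = "".join("1" if c in entries[phrase] else "0" for c in categories)
        table.append((phrase, bits))
    return table


# Hypothetical categories and entries, for illustration only.
categories = ["NEGATION", "QUANTITY", "AFFECT", "TIME", "PLACE"]
entries = {
    "not very much": {"NEGATION", "QUANTITY"},
    "happy": {"AFFECT"},
    "never": {"NEGATION", "TIME"},
}

for phrase, bits in build_bit_strings(entries, categories):
    print(phrase, bits, bits.count("1"))
# → happy 00100 1
# → never 10010 2
# → not very much 11000 2
```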

Significant Variables 

There are no variables of special significance associated with
the Content Module.

Error and Other Messages 

The following messages are printed out by the Content Module: 

- "Dictionary Not Found" if there is no dictionary associated
with the input for a particular run.

- "End of Job" if the run terminates normally when there are no 
further texts to be read. 

Called by and/or Call 

The Content Module is called only by the Control Module. 

The Content Module will call the following internal subroutines: 

- ARRAY 

- DINIT 
- INDXDLM

- ITEM 

- LSTAT 

- MAKEDIC

- SORT

- PHRASE 

- INTABLE
